Output files are written to ../out.
The TPR and FPR tables compute sensitivity and specificity from the score cutoffs described in the manuscript. The ROC curves generated by the pROC package, and the sensitivity and specificity read from them, are for visualization purposes only.
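A minimal sketch of cutoff-based sensitivity and specificity, written in Python for illustration rather than in the R used for this report. The cutoff rule here (the smallest control score that keeps at least 98% of controls negative) is an assumption; the manuscript's actual cutoffs are not reproduced. The function names and the example scores are hypothetical.

```python
import math

def threshold_at_specificity(control_scores, target_specificity=0.98):
    """Smallest observed control score with at least target_specificity
    of controls scoring at or below it (assumed cutoff rule)."""
    ranked = sorted(control_scores)
    n = len(ranked)
    # Number of controls that must be called negative; the small tolerance
    # guards against floating-point error in target_specificity * n.
    k = max(1, math.ceil(target_specificity * n - 1e-9))
    return ranked[k - 1]

def sensitivity_specificity(case_scores, control_scores, cutoff):
    """A sample is called positive when its score is strictly above the cutoff."""
    true_pos = sum(s > cutoff for s in case_scores)
    true_neg = sum(s <= cutoff for s in control_scores)
    return true_pos / len(case_scores), true_neg / len(control_scores)

# Hypothetical scores, for illustration only.
controls = list(range(100))   # 100 control scores: 0..99
cases = [50, 98, 99, 120]     # 4 case scores
cutoff = threshold_at_specificity(controls, 0.98)
sens, spec = sensitivity_specificity(cases, controls, cutoff)
print(cutoff, sens, spec)
```

Fixing the cutoff on control scores first and then counting cases above it gives a single operating point, which is why these values can differ slightly from points read off a smoothed or interpolated ROC curve.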
ROC summary by classifier, training set (833 cases, 560 controls; 815 and 551 for Clinical data).
| classifier | auc | pauc | n_cases | n_controls | pretty_label |
|---|---|---|---|---|---|
| Allelic imbalance | 0.64 | 0.61 | 833 | 560 | Allelic imbalance; AUC = 0.64, pAUC [Sp: 1-0.98] = 0.61 |
| Clinical data | 0.56 | 0.5 | 815 | 551 | Clinical data; AUC = 0.56, pAUC [Sp: 1-0.98] = 0.5 |
| Fragment endpoints | 0.64 | 0.59 | 833 | 560 | Fragment endpoints; AUC = 0.64, pAUC [Sp: 1-0.98] = 0.59 |
| Fragment lengths | 0.68 | 0.63 | 833 | 560 | Fragment lengths; AUC = 0.68, pAUC [Sp: 1-0.98] = 0.63 |
| SCNA | 0.69 | 0.64 | 833 | 560 | SCNA; AUC = 0.69, pAUC [Sp: 1-0.98] = 0.64 |
| SCNA-WBC | 0.7 | 0.64 | 833 | 560 | SCNA-WBC; AUC = 0.7, pAUC [Sp: 1-0.98] = 0.64 |
| SNV | 0.63 | 0.58 | 833 | 560 | SNV; AUC = 0.63, pAUC [Sp: 1-0.98] = 0.58 |
| SNV-WBC | 0.65 | 0.66 | 833 | 560 | SNV-WBC; AUC = 0.65, pAUC [Sp: 1-0.98] = 0.66 |
| WG methylation | 0.73 | 0.67 | 833 | 560 | WG methylation; AUC = 0.73, pAUC [Sp: 1-0.98] = 0.67 |
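ROC summary by classifier, validation set (464 cases, 362 controls; 457 and 358 for Clinical data).
| classifier | auc | pauc | n_cases | n_controls | pretty_label |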
|---|---|---|---|---|---|
| Allelic imbalance | 0.58 | 0.59 | 464 | 362 | Allelic imbalance; AUC = 0.58, pAUC [Sp: 1-0.98] = 0.59 |
| Clinical data | 0.55 | 0.5 | 457 | 358 | Clinical data; AUC = 0.55, pAUC [Sp: 1-0.98] = 0.5 |
| Fragment endpoints | 0.65 | 0.56 | 464 | 362 | Fragment endpoints; AUC = 0.65, pAUC [Sp: 1-0.98] = 0.56 |
| Fragment lengths | 0.69 | 0.62 | 464 | 362 | Fragment lengths; AUC = 0.69, pAUC [Sp: 1-0.98] = 0.62 |
| Pan-feature | 0.71 | 0.66 | 464 | 362 | Pan-feature; AUC = 0.71, pAUC [Sp: 1-0.98] = 0.66 |
| SCNA | 0.68 | 0.6 | 464 | 362 | SCNA; AUC = 0.68, pAUC [Sp: 1-0.98] = 0.6 |
| SCNA-WBC | 0.7 | 0.64 | 464 | 362 | SCNA-WBC; AUC = 0.7, pAUC [Sp: 1-0.98] = 0.64 |
| SNV | 0.62 | 0.56 | 464 | 362 | SNV; AUC = 0.62, pAUC [Sp: 1-0.98] = 0.56 |
| SNV-WBC | 0.68 | 0.64 | 464 | 362 | SNV-WBC; AUC = 0.68, pAUC [Sp: 1-0.98] = 0.64 |
| WG methylation | 0.71 | 0.63 | 464 | 362 | WG methylation; AUC = 0.71, pAUC [Sp: 1-0.98] = 0.63 |
Of the nonsynonymous SNVs in cfDNA that were also found in the same participant's matched WBC, 92.9% (11705 of 12601) were private to a single participant.
| Characteristic | Training: invasive cancer | Training: non-cancer | Validation: invasive cancer | Validation: non-cancer |
|---|---|---|---|---|
| Total, n (%) | 854 (100%) | 560 (100%) | 485 (100%) | 362 (100%) |
| Female | 594 (70%) | 436 (78%) | 307 (63%) | 235 (65%) |
| Age (SD) | 61 (12) | 60 (12) | 62 (12) | 59 (14) |
| Age 50+ | 710 (83%) | 452 (81%) | 414 (85%) | 274 (76%) |
| Race/Ethnicity | | | | |
| Black or African American | 54 (6%) | 46 (8%) | 33 (7%) | 25 (7%) |
| Hispanic | 43 (5%) | 29 (5%) | 31 (6%) | 22 (6%) |
| Other/unknown | 20 (2%) | 12 (2%) | 21 (4%) | 7 (2%) |
| White, non-Hispanic | 737 (86%) | 473 (84%) | 400 (82%) | 308 (85%) |
| Smoking status, n (%) | | | | |
| Ever-smoker/Missing | 445 (52%) | 238 (42%) | 242 (50%) | 179 (49%) |
| Never-smoker | 409 (48%) | 322 (57%) | 243 (50%) | 183 (51%) |
| Body Mass Index, n (%) | | | | |
| Missing | 1 (0%) | | | 1 (0%) |
| Normal/Underweight | 237 (28%) | 147 (26%) | 139 (29%) | 84 (23%) |
| Obesity | 341 (40%) | 233 (42%) | 186 (38%) | 154 (43%) |
| Overweight | 275 (32%) | 180 (32%) | 160 (33%) | 123 (34%) |
| Site Region, n (%) | | | | |
| Midwest | 150 (18%) | 83 (15%) | 127 (26%) | 64 (18%) |
| Northeast | 46 (5%) | 53 (9%) | 26 (5%) | 25 (7%) |
| South | 491 (57%) | 346 (62%) | 228 (47%) | 184 (51%) |
| West | 167 (20%) | 78 (14%) | 104 (21%) | 89 (25%) |
| Cancer Stage, n (%) | | | | |
| I | 289 (34%) | | 163 (34%) | |
| II | 239 (28%) | | 141 (29%) | |
| III | 159 (19%) | | 75 (15%) | |
| IV | 157 (18%) | | 93 (19%) | |
| Non-informative | 10 (1%) | | 13 (3%) | |
| Dx Method, n (%) | | | | |
| Clinical presentation | 561 (66%) | | 317 (65%) | |
| Screening | 293 (34%) | | 167 (34%) | |
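Detection rate among cancer cases (sensitivity), shown as percent (detected/total) [confidence interval], by classifier.
| classifier_name | Training set | Validation set |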
|---|---|---|
| Clinical data | 2.7% (22/815) [1.7%-4.1%] | 2.6% (12/457) [1.4%-4.5%] |
| SNV | 19% (159/833) [16%-22%] | 16% (75/464) [13%-20%] |
| Fragment endpoints | 22% (181/833) [19%-25%] | 18% (84/464) [15%-22%] |
| Allelic imbalance | 25% (210/833) [22%-28%] | 22% (101/464) [18%-26%] |
| SCNA | 33% (271/833) [29%-36%] | 27% (125/464) [23%-31%] |
| Fragment lengths | 28% (236/833) [25%-32%] | 29% (136/464) [25%-34%] |
| SCNA-WBC | 33% (278/833) [30%-37%] | 30% (139/464) [26%-34%] |
| SNV-WBC | 36% (299/833) [33%-39%] | 33% (155/464) [29%-38%] |
| WG methylation | 39% (328/833) [36%-43%] | 34% (158/464) [30%-39%] |
| Pan-feature | NA | 36% (165/464) [31%-40%] |
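Positive call rate among non-cancer participants (false-positive rate), shown as percent (positive/total) [confidence interval].
| classifier_name | Training set | Validation set |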
|---|---|---|
| Clinical data | 2% (11/551) [1%-3.5%] | 2.2% (8/358) [0.97%-4.4%] |
| Other classifiers | 2.1% (12/560) [1.1%-3.7%] | 2.2% (8/362) [0.96%-4.3%] |
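The bracketed ranges in the two tables above are binomial confidence intervals around a proportion such as 12/457. The report does not state which interval method or confidence level was used, so the sketch below assumes an exact (Clopper-Pearson) 95% interval purely for illustration. It is written in Python with only the standard library; the function names are hypothetical.

```python
import math

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed in log space to avoid underflow."""
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    if p <= 0.0:
        return 1.0
    if p >= 1.0:
        return 0.0
    total = 0.0
    for i in range(k + 1):
        log_pmf = (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                   + i * math.log(p) + (n - i) * math.log1p(-p))
        total += math.exp(log_pmf)
    return min(total, 1.0)

def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided interval for k successes in n trials (assumed method)."""
    def bisect(increasing_fn, target):
        # Find p in [0, 1] with increasing_fn(p) == target by bisection.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if increasing_fn(mid) < target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: P(X >= k | p) = alpha / 2, which increases with p.
    lower = 0.0 if k == 0 else bisect(lambda p: 1.0 - binom_cdf(k - 1, n, p), alpha / 2)
    # Upper bound: P(X <= k | p) = alpha / 2; its complement increases with p.
    upper = 1.0 if k == n else bisect(lambda p: 1.0 - binom_cdf(k, n, p), 1.0 - alpha / 2)
    return lower, upper

low, high = clopper_pearson(12, 457)
print(f"{12 / 457:.1%} [{low:.1%}-{high:.1%}]")
```

An exact interval is conservative for small counts, which matters for rows like Clinical data where only a handful of participants are called positive.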
| Predictor | Cancer type | Cancer type + stage | Cancer type + cTAF | Cancer type + cTAF + stage |
|---|---|---|---|---|
| (Intercept) | 0.3 | 3.9e-08 *** | 4.1e-18 *** | 5.4e-12 *** |
| Breast | 0.019 * | 0.9 | 0.53 | 0.32 |
| Lung | 0.038 * | 0.03 * | 0.62 | 0.48 |
| Colon/Rectum | 0.056 . | 0.046 * | 0.58 | 0.44 |
| Stage II | - | 0.00018 *** | - | 0.48 |
| Stage III | - | 2.3e-11 *** | - | 0.32 |
| Stage IV | - | 2.4e-14 *** | - | 0.32 |
| log10_ctaf | - | - | 1.9e-19 *** | 4.8e-17 *** |
| AIC | 545.9 | 448.7 | 201.2 | 204.6 |
| AIC - min(AIC) | 344.7 | 247.5 | 0 | 3.4 |
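The last row of the table above is each model's AIC minus the smallest AIC across the four models. A short Python sketch of that arithmetic, using the AIC values from the table (the variable names are illustrative):

```python
# AIC values copied from the table above.
aic = {
    "Cancer type": 545.9,
    "Cancer type + stage": 448.7,
    "Cancer type + cTAF": 201.2,
    "Cancer type + cTAF + stage": 204.6,
}

best = min(aic.values())
# Difference from the best (lowest-AIC) model, rounded to one decimal as in the table.
delta_aic = {model: round(value - best, 1) for model, value in aic.items()}

for model, delta in delta_aic.items():
    print(f"{model}: {delta}")
```

The model with cTAF but without stage has the lowest AIC, and adding stage to it raises the AIC by 3.4.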
| dataset | pauc |
|---|---|
| Clinical data | 0.5 |
| Fragment endpoints | 0.56 |
| SNV | 0.56 |
| Allelic imbalance | 0.59 |
| SCNA | 0.6 |
| Fragment lengths | 0.62 |
| WG methylation | 0.63 |
| SCNA-WBC | 0.64 |
| SNV-WBC | 0.64 |
| Pan-feature | 0.66 |
| classifier_name | train | valid |
|---|---|---|
| Allelic imbalance | 0.5662 | 0.5792 |
| Clinical data | 0.7665 | 0.7395 |
| Fragment endpoints | 0.002322 | 0.002439 |
| Fragment lengths | 0.7325 | 0.7151 |
| Pan-feature | NA | 0.8074 |
| SCNA | 0.722 | 0.7923 |
| SCNA-WBC | 0.6936 | 0.7058 |
| SNV | 0.6651 | 0.7501 |
| SNV-WBC | 0.5448 | 0.5448 |
| WG methylation | 0.7535 | 0.826 |
| classifier_name | train_or_valid | I | II | III | IV |
|---|---|---|---|---|---|
| Allelic imbalance | train | 6.0% [3.5-9.4]% (17/284) | 10.2% [6.6-14.8]% (24/236) | 38.5% [30.8-46.6]% (60/156) | 69.4% [61.6-76.5]% (109/157) |
| Allelic imbalance | valid | 4.4% [1.8-8.8]% (7/160) | 14.2% [8.9-21.1]% (20/141) | 32.9% [22.1-45.1]% (23/70) | 54.8% [44.2-65.2]% (51/93) |
| Fragment endpoints | train | 3.2% [1.5-5.9]% (9/284) | 9.7% [6.3-14.3]% (23/236) | 26.9% [20.1-34.6]% (42/156) | 68.2% [60.3-75.4]% (107/157) |
| Fragment endpoints | valid | 3.8% [1.4-8.0]% (6/160) | 7.1% [3.5-12.7]% (10/141) | 24.3% [14.8-36.0]% (17/70) | 54.8% [44.2-65.2]% (51/93) |
| Fragment lengths | train | 3.2% [1.5-5.9]% (9/284) | 16.5% [12.0-21.9]% (39/236) | 45.5% [37.5-53.7]% (71/156) | 74.5% [67.0-81.1]% (117/157) |
| Fragment lengths | valid | 6.2% [3.0-11.2]% (10/160) | 17.0% [11.2-24.3]% (24/141) | 52.9% [40.6-64.9]% (37/70) | 69.9% [59.5-79.0]% (65/93) |
| Pan-feature | valid | 8.8% [4.9-14.2]% (14/160) | 19.9% [13.6-27.4]% (28/141) | 64.3% [51.9-75.4]% (45/70) | 83.9% [74.8-90.7]% (78/93) |
| SCNA | train | 3.5% [1.7-6.4]% (10/284) | 21.2% [16.2-27.0]% (50/236) | 55.1% [47.0-63.1]% (86/156) | 79.6% [72.5-85.6]% (125/157) |
| SCNA | valid | 5.6% [2.6-10.4]% (9/160) | 15.6% [10.0-22.7]% (22/141) | 50.0% [37.8-62.2]% (35/70) | 63.4% [52.8-73.2]% (59/93) |
| SCNA-WBC | train | 5.6% [3.3-9.0]% (16/284) | 20.3% [15.4-26.0]% (48/236) | 55.8% [47.6-63.7]% (87/156) | 80.9% [73.9-86.7]% (127/157) |
| SCNA-WBC | valid | 4.4% [1.8-8.8]% (7/160) | 18.4% [12.4-25.8]% (26/141) | 54.3% [41.9-66.3]% (38/70) | 73.1% [62.9-81.8]% (68/93) |
| SNV | train | 2.1% [0.8-4.5]% (6/284) | 8.9% [5.6-13.3]% (21/236) | 31.4% [24.2-39.3]% (49/156) | 52.9% [44.8-60.9]% (83/157) |
| SNV | valid | 1.9% [0.4-5.4]% (3/160) | 11.3% [6.6-17.8]% (16/141) | 28.6% [18.4-40.6]% (20/70) | 38.7% [28.8-49.4]% (36/93) |
| SNV-WBC | train | 5.6% [3.3-9.0]% (16/284) | 23.3% [18.1-29.2]% (55/236) | 61.5% [53.4-69.2]% (96/156) | 84.1% [77.4-89.4]% (132/157) |
| SNV-WBC | valid | 8.8% [4.9-14.2]% (14/160) | 19.1% [13.0-26.6]% (27/141) | 57.1% [44.7-68.9]% (40/70) | 79.6% [69.9-87.2]% (74/93) |
| WG methylation | train | 7.7% [4.9-11.5]% (22/284) | 29.2% [23.5-35.5]% (69/236) | 65.4% [57.4-72.8]% (102/156) | 86.0% [79.6-91.0]% (135/157) |
| WG methylation | valid | 8.1% [4.4-13.5]% (13/160) | 19.9% [13.6-27.4]% (28/141) | 57.1% [44.7-68.9]% (40/70) | 82.8% [73.6-89.8]% (77/93) |
| classifier_name | train_or_valid | lod | p_detection | lod_lcb | lod_ucb | specificity | n_obs | study |
|---|---|---|---|---|---|---|---|---|
| Allelic imbalance | train | 0.007932 | 0.5 | 0.005645 | 0.01115 | SnAtSp=0.98 | 296 | CCGA1 |
| Allelic imbalance | valid | 0.007827 | 0.5 | 0.004465 | 0.01372 | SnAtSp=0.98 | 113 | CCGA1 |
| Fragment endpoints | train | 0.01205 | 0.5 | 0.008126 | 0.01786 | SnAtSp=0.98 | 296 | CCGA1 |
| Fragment endpoints | valid | 0.01937 | 0.5 | 0.009789 | 0.03833 | SnAtSp=0.98 | 113 | CCGA1 |
| Fragment lengths | train | 0.004095 | 0.5 | 0.002865 | 0.005854 | SnAtSp=0.98 | 296 | CCGA1 |
| Fragment lengths | valid | 0.003154 | 0.5 | 0.001835 | 0.005421 | SnAtSp=0.98 | 113 | CCGA1 |
| Pan-feature | valid | 0.000889 | 0.5 | 0.000585 | 0.001352 | SnAtSp=0.98 | 113 | CCGA1 |
| SCNA | train | 0.002665 | 0.5 | 0.001814 | 0.003916 | SnAtSp=0.98 | 296 | CCGA1 |
| SCNA | valid | 0.003947 | 0.5 | 0.002019 | 0.007718 | SnAtSp=0.98 | 113 | CCGA1 |
| SCNA-WBC | train | 0.001618 | 0.5 | 0.001144 | 0.00229 | SnAtSp=0.98 | 296 | CCGA1 |
| SCNA-WBC | valid | 0.0025 | 0.5 | 0.001425 | 0.004384 | SnAtSp=0.98 | 113 | CCGA1 |
| SNV | train | 0.01868 | 0.5 | 0.01199 | 0.02908 | SnAtSp=0.98 | 296 | CCGA1 |
| SNV | valid | 0.01634 | 0.5 | 0.008032 | 0.03325 | SnAtSp=0.98 | 113 | CCGA1 |
| SNV-WBC | train | 0.001257 | 0.5 | 0.000971 | 0.001628 | SnAtSp=0.98 | 296 | CCGA1 |
| SNV-WBC | valid | 0.001184 | 0.5 | 0.000746 | 0.001879 | SnAtSp=0.98 | 113 | CCGA1 |
| WG methylation | train | 0.000853 | 0.5 | 0.000652 | 0.001115 | SnAtSp=0.98 | 296 | CCGA1 |
| WG methylation | valid | 0.001241 | 0.5 | 0.000805 | 0.001913 | SnAtSp=0.98 | 113 | CCGA1 |
| Targeted methylation (Second CCGA substudy) | valid | 0.000131 | 0.5 | 0.000103 | 0.000166 | SnAtSp=0.98 | 559 | CCGA2 |
| classifier_name | train_or_valid | lod | lod_meth | lod_relative_to_meth |
|---|---|---|---|---|
| Allelic imbalance | valid | 0.007827 | 0.001241 | 6.307 |
| Fragment endpoints | valid | 0.01937 | 0.001241 | 15.61 |
| Fragment lengths | valid | 0.003154 | 0.001241 | 2.541 |
| Pan-feature | valid | 0.000889 | 0.001241 | 0.7164 |
| SCNA | valid | 0.003947 | 0.001241 | 3.18 |
| SCNA-WBC | valid | 0.0025 | 0.001241 | 2.015 |
| SNV | valid | 0.01634 | 0.001241 | 13.17 |
| SNV-WBC | valid | 0.001184 | 0.001241 | 0.9541 |
| WG methylation | valid | 0.001241 | 0.001241 | 1 |
| Targeted methylation (Second CCGA substudy) | valid | 0.000131 | 0.001241 | 0.1056 |
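The lod_relative_to_meth column is each classifier's validation LOD divided by the WG methylation validation LOD (0.001241). A brief Python sketch with a few rows from the table; because the inputs are the rounded table values, the last digit can differ slightly from the reported ratios.

```python
# Validation-set LODs copied from the table above (subset of rows).
lod = {
    "Allelic imbalance": 0.007827,
    "Fragment endpoints": 0.01937,
    "Pan-feature": 0.000889,
    "WG methylation": 0.001241,
    "Targeted methylation (Second CCGA substudy)": 0.000131,
}

lod_meth = lod["WG methylation"]
# Ratios above 1 mean a higher (less sensitive) LOD than WG methylation.
lod_relative = {name: value / lod_meth for name, value in lod.items()}

for name, ratio in lod_relative.items():
    print(f"{name}: {ratio:.4g}")
```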
| wh_assay | n_correct | n_total | prec_overall |
|---|---|---|---|
| SCNA | 52 | 127 | 40.94 |
| SNV-WBC | 44 | 127 | 34.65 |
| WG methylation | 95 | 127 | 74.8 |
Overall CSO accuracy relative to Methylation:
WG methylation:SCNA = 182.7% (95/52)
WG methylation:SNV-WBC = 215.9% (95/44)
WG methylation:WG methylation = 100.0% (95/95)
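Paired CSO calls on the same 127 participants (no = incorrect, yes = correct). Row and column assignments are inferred from the marginal totals (95, 52, and 44 correct calls).
WG methylation (rows) vs. SCNA (columns):
| | no | yes |
|---|---|---|
| no | 27 | 5 |
| yes | 48 | 47 |
WG methylation (rows) vs. SNV-WBC (columns):
| | no | yes |
|---|---|---|
| no | 31 | 1 |
| yes | 52 | 43 |
SCNA (rows) vs. SNV-WBC (columns):
| | no | yes |
|---|---|---|
| no | 51 | 24 |
| yes | 32 | 20 |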
Compute correlation between cTAF and stage for each cancer type.
| cancer_type | estimate | n | p.value | p.value_adjust | stars |
|---|---|---|---|---|---|
| Lung | 0.4147 | 40 | 0.007793 | 0.007793 | ** |
| Colon/Rectum | 0.7702 | 48 | 1.552e-10 | 2.07e-10 | *** |
| Breast | 0.4837 | 162 | 7.034e-11 | 1.407e-10 | *** |
| Remaining | 0.5826 | 159 | 7.906e-16 | 3.163e-15 | *** |
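The p.value_adjust column is numerically consistent with a Benjamini-Hochberg adjustment across the four cancer types (for example, 7.034e-11 × 4/2 ≈ 1.407e-10). The report does not name the method, so that is an inference. A minimal Python sketch of the adjustment, with a hypothetical function name:

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    # Walk from the largest p-value to the smallest so that a running minimum
    # enforces monotonicity of the adjusted values.
    order = sorted(range(m), key=lambda i: p_values[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Raw p-values from the table above: Lung, Colon/Rectum, Breast, Remaining.
raw = [0.007793, 1.552e-10, 7.034e-11, 7.906e-16]
adjusted = benjamini_hochberg(raw)
for p, q in zip(raw, adjusted):
    print(f"{p:.4g} -> {q:.4g}")
```

Small differences in the last digit relative to the table come from using the rounded raw p-values as input.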
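McNemar tests for the difference in detection between each classifier and WG methylation (validation set only). Each 2×2 table cross-tabulates paired calls on the same cancer cases: rows are the WG methylation call and columns are the comparison classifier's call. The classifier assigned to each table is inferred from the marginal totals.
WG methylation vs. Allelic imbalance:
| | cancer | non_cancer |
|---|---|---|
| cancer | 97 | 61 |
| non_cancer | 4 | 302 |
WG methylation vs. SNV:
| | cancer | non_cancer |
|---|---|---|
| cancer | 73 | 85 |
| non_cancer | 2 | 304 |
WG methylation vs. SNV-WBC:
| | cancer | non_cancer |
|---|---|---|
| cancer | 140 | 18 |
| non_cancer | 15 | 291 |
WG methylation vs. SCNA:
| | cancer | non_cancer |
|---|---|---|
| cancer | 116 | 42 |
| non_cancer | 9 | 297 |
WG methylation vs. SCNA-WBC:
| | cancer | non_cancer |
|---|---|---|
| cancer | 132 | 26 |
| non_cancer | 7 | 299 |
WG methylation vs. Fragment endpoints:
| | cancer | non_cancer |
|---|---|---|
| cancer | 80 | 78 |
| non_cancer | 4 | 302 |
WG methylation vs. Fragment lengths:
| | cancer | non_cancer |
|---|---|---|
| cancer | 126 | 32 |
| non_cancer | 10 | 296 |
WG methylation vs. Clinical data (457 cases with clinical data):
| | cancer | non_cancer |
|---|---|---|
| cancer | 7 | 149 |
| non_cancer | 5 | 296 |
WG methylation vs. Pan-feature:
| | cancer | non_cancer |
|---|---|---|
| cancer | 155 | 3 |
| non_cancer | 10 | 296 |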
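A McNemar test uses only the discordant pairs of a 2×2 table, that is, the two off-diagonal counts (61 and 4 in the first table above, 3 and 10 in the last). The report does not say which variant was run, so the sketch below assumes the exact two-sided binomial form, written in Python with the standard library only. The function name is hypothetical.

```python
import math

def mcnemar_exact(b, c):
    """Exact two-sided McNemar p-value from the two discordant counts.

    Under the null hypothesis each discordant pair is equally likely to fall
    in either off-diagonal cell, so min(b, c) follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # One-sided tail probability P(X <= k), using exact integer arithmetic.
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

print(mcnemar_exact(61, 4))   # discordant counts from the first table
print(mcnemar_exact(3, 10))   # discordant counts from the last table
```

The concordant cells (both positive or both negative) do not enter the calculation, which is why tables with similar diagonals can give very different p-values.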
Writes all output figures except the UpSet plots, which are written separately.
R version 4.1.2 (2021-11-01)
Platform: x86_64-pc-linux-gnu (64-bit)
locale: LC_CTYPE=C.UTF-8, LC_NUMERIC=C, LC_TIME=C.UTF-8, LC_COLLATE=C.UTF-8, LC_MONETARY=C.UTF-8, LC_MESSAGES=C.UTF-8, LC_PAPER=C.UTF-8, LC_NAME=C, LC_ADDRESS=C, LC_TELEPHONE=C, LC_MEASUREMENT=C.UTF-8 and LC_IDENTIFICATION=C
attached base packages: stats, graphics, grDevices, utils, datasets, methods and base
other attached packages: ggplot2(v.3.3.6)
loaded via a namespace (and not attached): Rcpp(v.1.0.9), lattice(v.0.20-45), tidyr(v.1.2.0), digest(v.0.6.29), packrat(v.0.6.0), utf8(v.1.2.2), R6(v.2.5.1), plyr(v.1.8.6), backports(v.1.4.1), evaluate(v.0.15), assertr(v.2.8), highr(v.0.9), pillar(v.1.8.0), rlang(v.1.0.4), scam(v.1.2-12), jquerylib(v.0.1.4), hexbin(v.1.28.2), Matrix(v.1.3-4), rmarkdown(v.2.8), textshaping(v.0.3.6), labeling(v.0.4.2), splines(v.4.1.2), readr(v.2.0.1), stringr(v.1.4.0), pander(v.0.6.3), bit(v.4.0.4), munsell(v.0.5.0), broom(v.1.0.0), compiler(v.4.1.2), xfun(v.0.31), systemfonts(v.1.0.4), pkgconfig(v.2.0.3), mgcv(v.1.8-36), htmltools(v.0.5.2), tidyselect(v.1.1.2), tibble(v.3.1.8), gridExtra(v.2.3), fansi(v.1.0.3), crayon(v.1.5.1), dplyr(v.1.0.9), tzdb(v.0.1.2), withr(v.2.5.0), ggpubr(v.0.2.3), grid(v.4.1.2), nlme(v.3.1-152), jsonlite(v.1.8.0), gtable(v.0.3.0), lifecycle(v.1.0.1), magrittr(v.2.0.3), pROC(v.1.16.2), scales(v.1.2.0), cli(v.3.3.0), stringi(v.1.7.8), vroom(v.1.5.4), cachem(v.1.0.6), farver(v.2.1.1), ggsignif(v.0.6.2), bslib(v.0.4.0), ragg(v.1.2.2), ellipsis(v.0.3.2), generics(v.0.1.3), vctrs(v.0.4.1), cowplot(v.1.1.1), tools(v.4.1.2), bit64(v.4.0.5), glue(v.1.6.2), purrr(v.0.3.4), hms(v.1.1.0), parallel(v.4.1.2), fastmap(v.1.1.0), yaml(v.2.3.5), colorspace(v.2.0-3), UpSetR(v.1.3.3), knitr(v.1.39) and sass(v.0.4.2)